Abstract



In order to assess the potential user population for Project TALENT
data and to identify barriers to usage of the data for secondary analysis,
a small telephone survey of leading large-scale survey researchers was
undertaken. Because of the dearth of relevant findings, the survey was
designed to shed light on barriers to secondary use of large data bases 
in general as well as on barriers to use of TALENT data. The survey was 
also designed to elicit suggestions for strategies for reducing barriers 
to use of the data. 

Four levels of barrier were considered: unawareness of the exis- 
tence of the data base, negative attitudes about secondary analysis, 
specific difficulties or deficiencies of the data base, and cost of data
processing. In the case of Project TALENT, most researchers were aware 
of its existence, but few had a clear idea of the scope of information 
contained in the data base. Half of the respondents expressed some form 
of negative feeling about secondary analysis in general, and a folklore 
myth that the TALENT data are severely biased was also uncovered. When 
the data base was described to the respondents, most felt that useful 
data were present and many expressed interest in future use of the data. 
For established researchers, costs at the level required for use of 
Project TALENT data were not perceived as a barrier, although they might 
be for beginning researchers. 

Recommendations obtained from the respondents for improving usage 
generally matched the steps recently taken by the TALENT staff (see 
Chapter 9 of this report), although some areas were identified where 
additional effort was needed (e.g., publishing more articles based on
Project TALENT). Finally, continuation of activities developed in this
survey is desirable both to refine our knowledge about secondary data 
usage patterns and, as a side effect, to heighten interest in secondary 
analysis of TALENT data. 
AN EXPLORATORY STUDY OF BARRIERS TO USAGE OF
LARGE-SCALE DATA BASES SUCH AS PROJECT TALENT 



Introduction

Large-scale data bases in the social and behavioral sciences pro- 
vide tremendous advantages, both in terms of cost and in terms of the 
opportunity for addressing important research questions that are nor-
mally beyond the reach of most investigators. As such, they should be
viewed as a data resource — a means of acquiring key information at a 
fraction of the cost of gathering new data. The importance of secondary
analysis as a research tool has been discussed by several authors 
(Koehler, 1977; Glass, 1976; Hyman, 1972), but the methods of matching 
researchers to data have not yet been thoroughly explored. The American
Institutes for Research (AIR) feels a responsibility for taking positive 
action to promote secondary analyses of Project TALENT, a major data 
base that it maintains, and toward this end the TALENT staff has been 
engaged in a serious effort to make Project TALENT data more accessible 
to researchers. 

This report describes a study of the opinions and attitudes of 
professional survey researchers concerning the potential for utilizing 
large-scale data bases such as Project TALENT. The intent of the study
was to identify the barriers that influence secondary analyses and make
specific recommendations to encourage further use of existing data
resources.

If a data resource is to be utilized as a data bank and be readily
available to other researchers, then there are clearly responsibilities 
for its managers. These include the maintenance of a qualified staff of 
consultants, the dissemination of technical information, the cataloging 
of uses of the data, the review of user needs and problems, and the 
periodic upgrading of the contents of the data bank and the services 
provided (Nasatir, 1973). 
What is less obvious is the extent to which the managers of a data
base should be expected to go to encourage the use of their data. 
Merely making the data available may be professionally acceptable, but 
this does not solve the problem; without extensive publicity and dis- 
semination efforts, very few other researchers are likely to use a data 
base. Unfortunately, encouraging outside researchers to make use of a
large-scale data base is no easy task; as will be discussed in this 
report, there are definite barriers to the use of large-scale data bases 
that have to be overcome, both by the prospective users and by the data 
base managers. 

The problem of data dissemination is particularly difficult in the 
social sciences, where attention to individual respondents' rights to
privacy must be given highest priority. Merely deleting a respondent's
name from the data record does not preclude his/her identification in
terms of a unique combination of experiences (e.g., birth date, college
major, occupation, number and age of children, military experience).
Responsible dissemination must recognize this problem and deal with it.
In the case of Project TALENT, data have been released only to research-
ers for their own use and only after they have agreed in writing not to
use the data in any way that would identify individual respondents. 

This report addresses the issue of how best to promote widespread 
use of large-scale data bases, examining the perceived and actual
facilitators and barriers to the use of a particular data base. The
report focuses on the Project TALENT data base, a large-scale data 
resource that is potentially useful to a broad spectrum of researchers. 

Background Information on the Project TALENT Data Bank 

Project TALENT is an ongoing nationwide study of some 400,000
American men and women who were in high school in 1960. It has been
supported through the years first largely by the U.S. Office of Educa-
tion and, more recently, wholly by the National Institute of Education.
The TALENT data base includes measures of the cognitive skills, inter-
ests, plans, family backgrounds, and current activities of the original
1960 sample, plus data collected one year after high school on over
190,000 of its participants, data collected five years after high school
on over 130,000 of its participants, and data collected eleven years
after high school on over 95,000 of its participants. Plans are under
way for a seventeen-years-after-high-school follow-up survey that will
extend this series of longitudinal studies of the educational, occupa-
tional, family and life style history of this representative cross-
section of American society.

In order to facilitate the use of TALENT data by the research 
community, the American Institutes for Research maintains the Project 
TALENT Data Bank. Data Bank staff are available to consult with research-
ers to determine their particular needs, to interpret them in terms of
possible Project TALENT contributions, and to supply tapes or carry out
the necessary computer runs and analyses. Preliminary planning for the
use of Project TALENT data has been facilitated by publication of a
comprehensive Data Bank Handbook which describes the data base, the 
procedures used for sampling and data collection, and how to locate and 
specify variables of interest (Wise et al., 1977).* 

Between 1960 and 1977, over 200 studies, conducted both by TALENT 
staff and by outside researchers, have utilized Project TALENT data. 
The data have been used to support Congressional testimony on vocational 
education, higher education, guidance and counseling, and fertility
questions. Studies have been made of the interactions among race, sex, 
socioeconomic status, and various ability measures, and of their effects 
on subsequent educational and career attainment. TALENT data have been 
used to assess the effects of school characteristics, curriculum and 
guidance opportunities, career preferences, and marital and family 
history on postsecondary education and career success. As one researcher 
said, "the best currently available evidence about high schools' effects
on their students is found in survey data collected by Project TALENT" 
(Jencks et al., 1972, p. 89). 



*The Handbook may be obtained for $6.00 from the American Institutes for 
Research, P.O. Box 1113, Palo Alto, California 94302. Readers wishing
further information regarding Project TALENT or the Project TALENT Data 
Bank are invited to contact either Dr. Lauress Wise, Deputy Director of 
Project TALENT, or Dr. Donald McLaughlin, Director of the Project TALENT 
Data Bank, at the American Institutes for Research. 
In addition to Jencks' study of the effects of family and schooling
on inequality, several other major studies have made use of Project
TALENT data. For example, the recent highly publicized study conducted
by the Educational Testing Service concerning the decline in test scores 
(Turnbull et al., 1977) relied heavily on Project TALENT data. The list 
of publications by John Flanagan and others on the Project TALENT staff 
includes dozens of separate references. 

Previous Research on the Use of Large-Scale Data Bases 

Before presenting our own findings concerning potential barriers 
and facilitators to the use of large-scale data bases, the results of 
other similar studies should be mentioned. Ennis (1964) found that only
21% of the data sources used in articles published in 15 major American 
sociological journals in 1962 had led to other publications. Bell 
(1970) pointed out that not only do the majority of social scientists 
avoid secondary analysis of data, but those who do use secondary anal- 
ysis tend to be situated at the larger colleges and universities. Only 
one of Bell's 15 respondents stated that he had first learned of a data 
source through a publication; eight discovered the existence of their
data through lists published by data archives; and six found out about
the data through casual conversations or correspondence with colleagues.
Ten of these 15 said that they were disappointed with the utility of the
data once they received it. Bell concluded that prospective users tend
to be uninformed about the contents of a secondary data source, they 
often overestimate the costs of data acquisition, and they may simply 
not have the skills required for secondary analysis. The barriers to 
secondary data use that he found were inefficient search techniques in 
locating data sources, lack of information about the data, and poor 
quality in data coding.
Babbie (1973) enumerated some of the responsibilities of the orig- 
inal researcher when a data base is released for secondary analysis. 
First, the original researcher should prepare a methodological report on 
his study, not only indicating the manner in which the study was con- 
ducted but also pointing to special strengths and weaknesses in the 
data. Second, the original researcher should request copies of reports 
prepared from his data in order to review them for inaccuracies and
misinterpretations. Finally, the original researcher should challenge
any misuses of his data in secondary analyses.



Hyman (1972) provided probably the richest source of information
concerning the problems of using an existing data source. He cited lack
of training in secondary analysis techniques as a major obstacle, as 
well as the time and effort involved, lack of awareness of data sources, 
delays in obtaining data, costs involved, the quality of the data and 
its documentation, and the absence of key variables. The benefits of 
secondary analysis that he mentioned are that it economizes on time, 
money, and personnel, reduces the intrusion into the lives of subjects 
and respondents, provides a training ground for beginning researchers, 
allows studies involving changes over time or multinational settings, 
affords the possibility for multiple replication of survey findings, and
compels researchers to think more broadly and abstractly about their
work.

Hyman's examples of successful attempts at secondary analysis 
provide an excellent description of the target population both for this 
study and for future dissemination efforts concerning Project TALENT 
data. The characteristics of researchers who are likely to perform 
secondary analyses, he found, include the possession of varied and broad 
interests, a tolerance of minor imperfections in a data source, sensi-
tivity to the opportunities that are presented, a wide network of infor-
mation sources, an understanding of how completely a vein of information
can be worked and when to move on to other data, an open mind as to what
directions one's inquiries should take, and an ability to see the larger
potential of even minor indicators.

Hyman characterized the experienced secondary analyst as a person 
who is able to cope with the obstacles normally standing in the way of 
adequate treatment of error, concerned about the ambiguity or invalidity
of his indicators as instruments for the measurement of particular 
variables, and familiar with the often complex designs of survey re-
search. Above all, secondary analysts must avoid being either too rigid
in their data requirements or too unsystematic in their search for new
data: "they are purposeful in their search, but still relatively easy to
please. Since their pursuit is likely to be rewarded, their affection
[for the data] grows. As the relationship with the archive persists,
they become more familiar with [its] many charms and the quirks in [its]
data, and progressively more skillful in their dealings. Each later
encounter is simpler and easier. Thus a fruitful and gratifying rela-
tionship [between researcher and data base] develops" (Hyman, 1972, p.
79).



Focus of the Current Report 

As part of the Project TALENT activities funded under Grant No.
NIE-G-74-0003, the current study was undertaken to determine the factors
that inhibit the use of data bases such as Project TALENT. This study
adds to the work of Hyman (1972), Nasatir (1973), Babbie (1973) and
others who have addressed the problems of access to large-scale data
bases and secondary analysis.

The current study focuses specifically on obtaining the informed
opinion of experienced survey researchers to address the following
questions:

1. How knowledgeable are researchers about Project TALENT and its
data base? What are their sources of information?

2. What are the most effective methods for disseminating infor-
mation about Project TALENT and encouraging the use of its
data base?

3. What are the barriers that need to be overcome before re-
searchers will want to use Project TALENT data? Who would be
most likely to use the data?

4. What can Project TALENT staff do to increase the use of TALENT
data by other researchers?



The remaining sections of this report describe the procedures followed
in carrying out this study, the results obtained, and the implications
for managers and sponsors of large-scale data bases.



Methods 



Sample 

Rather than restrict the sample solely to secondary analysts, it 
was decided that researchers in the social sciences who were experienced 
in the use of large-scale survey data should be the primary source of 
respondents. In addition, efforts were made to contact others who were
knowledgeable about the use of data banks and the funding of research 
projects based on existing survey data. Ninety-five potential research 
respondents were identified from the educational, psychological, socio- 
logical, vocational, and demographic literature for the past two years 
as authors who had recently conducted studies using survey data similar 
to those available in Project TALENT. Mail and telephone responses were
obtained from 50 individual researchers: 18 sociologists, nine survey
experts, seven economists, five labor experts, five involved in popu-
lation studies, three policy researchers, two educators, and one psycho-
logist. Of these, four had previously used TALENT data. In addition,
sixteen representatives of Federal funding sources were considered as
possible survey respondents, and five of these were contacted by tele-
phone. Thus a total of 55 interviews were conducted.

Instruments 

An interview protocol was devised to elicit comments from the
respondents regarding the use of Project TALENT and other similar data
bases. The open-ended questions were intended merely to guide the
conversation in order to capitalize on the personal opinions and sugges-
tions of the respondents. Although the emphasis was on respondents'
reactions to Project TALENT, they were also given the option of talking
about large-scale data bases in general. Respondents were asked about
their previous sources of information about Project TALENT, the nature
of their current research, the problems they had encountered in using
large-scale data bases such as Project TALENT, the suggestions that they
had for increasing the use of Project TALENT data, and the topical areas
that they most wanted to see included in the next TALENT survey (see 
Appendix A for a copy of the interview guide). 
A slightly different interview guide was used to obtain comments
from the five Federal officials. These respondents were asked to de- 
scribe their sources of information about Project TALENT, the kinds of 
survey research efforts that they sponsor, the barriers that prevented 
their funding research using large-scale surveys, and what they thought 
Project TALENT could do to increase its relevance to government research 
needs (see Appendix B for a copy of the Interview Guide). 

Procedures 

After the pool of respondents had been decided upon, they were sent 
an introductory letter containing an explanation of the purpose of the 
study, a brochure describing Project TALENT, and the set of questions to 
which they were requested to respond. Of the 111 individuals who had 
been mailed questions, five respondents mailed in their replies, and the 
rest were contacted by telephone. Three declined to be interviewed, and
53 could not be reached by telephone within the time limits of the 
survey. Thus responses were obtained from a total of 55 individuals. 
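
The counts reported in the Sample and Procedures sections can be reconciled with a short calculation. The sketch below is illustrative only: the variable names are hypothetical, and the overall response rate is a derived figure that the report itself does not state.

```python
# Counts taken from the Sample and Procedures sections of this report.
researchers_identified = 95   # potential research respondents
federal_representatives = 16  # Federal funding sources considered
mail_replies = 5              # respondents who mailed in their replies
declined = 3                  # declined to be interviewed
unreachable = 53              # could not be reached within the time limits
completed = 55                # total responses obtained

mailed = researchers_identified + federal_representatives

# Telephone interviews are implied by the totals rather than stated directly.
telephone_interviews = completed - mail_replies

# Every mailed individual should fall into exactly one disposition.
accounted_for = mail_replies + telephone_interviews + declined + unreachable

# Derived figure (an assumption of this sketch, not reported in the text).
response_rate = completed / mailed

print(f"Mailed: {mailed}")
print(f"Telephone interviews: {telephone_interviews}")
print(f"Accounted for: {accounted_for}")
print(f"Overall response rate: {response_rate:.1%}")
```

Under these figures, the five mail replies, the implied telephone interviews, the three refusals, and the 53 unreachable individuals together account for everyone who was mailed the questions.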

Respondents were first asked if they had received the mailed ver-
sion of the survey and if they had formed some opinions concerning it.
(Virtually all of the respondents had received the mailed version, but
most had not taken the time to formulate their responses.) They were
then asked each of the questions in turn, with occasional prompts in-
serted to stimulate critical opinions, e.g., "Do you feel costs to be a
barrier? — What about support services?" At the end of each telephone 
interview, notes on the conversation were recorded for later analysis. 

After all of the telephone interviews had been completed and tran- 
scribed, the responses to each question were consolidated into major 
response categories to facilitate interpretation of the findings. 
Especially constructive suggestions and criticisms were also extracted 
for further action.




Results 



Respondents' Awareness of Project TALENT

Table 1 presents the responses to the first questions of interest:
whether the respondents had heard of Project TALENT before and what
their sources of information about Project TALENT were. As might be
expected of a sample of survey researchers, nearly all of the respon-
dents claimed at least to have heard of Project TALENT before. Thus, if
the results of this survey generalize to the population, it appears that
lack of awareness of Project TALENT is not preventing researchers from
making use of the Data Bank, but that mere awareness in itself is not
sufficient to lead to use of Project TALENT.

It was difficult for most respondents to describe accurately what
their sources of information on Project TALENT were. They could recall
having talked to other researchers about TALENT findings or encountering 
references to TALENT in the research literature, but few could remember 
a specific reference. Only those who had a personal interest in the 
TALENT data and who had obtained copies of Project TALENT reports showed 
any real familiarity with the study. Because the term "talent" has been 
applied to many endeavors, including the NSF Talent Search, it is
possible that some respondents who claimed awareness were mistaken. 

Barriers to Use of a Data Base Mentioned by Respondents

Table 2 presents a summary of the comments of the respondents
regarding the barriers that they saw to the use of large-scale survey
research data like Project TALENT by outside researchers. Their re-
sponses have been grouped into five major areas of concern:

1. unfamiliarity with the data base and its contents,

2. attitudes against secondary analysis,

3. difficulties in carrying out secondary analysis,
Table 1

Respondents' Awareness of Project TALENT*

                                                                      Respondents
Amount of Information                                                N   % of Total

Had not been aware of Project TALENT                                  5     (11%)
Generally aware of the project; heard about it through colleagues    30     (67%)
Had read articles about Project TALENT                               22     (49%)
Had read TALENT publications                                         11     (24%)

*Limited to the 46 respondents who had not used Project TALENT data. The four previous TALENT users
indicated that in addition to conversations with colleagues and journal articles, they had read
Project TALENT publications such as the Data Bank Handbook in the course of their research.
Five federal respondents were excluded.
Table 2

Barriers to Use of a Data Base Mentioned by Respondents

N=50

                                                          N     %

Unfamiliarity with the data                              50  (100%)
    Data bases generally                                 13   (26%)
    TALENT in particular                                 37   (74%)

Attitudes against secondary analysis                     25   (50%)
    Suspicion about data validity                        13   (26%)
    Desire to collect own data                           17   (34%)

Difficulties in secondary analysis                       48   (96%)
    Absence of key variables                             33   (66%)
    Quality of the data                                  21   (42%)
    Nonaccess to raw data                                22   (44%)
    Nonrepresentativeness of the sample                  14   (28%)
    Complexity of the data base                           7   (14%)

Costs of secondary analysis using large data bases       31   (62%)
    Obtaining funding                                    10   (20%)
    Expenses of data acquisition                         23   (46%)

Poor services provided by the data bank staff            31   (62%)
    Quality of the services                              23   (46%)
    Quality of the documentation                         13   (26%)
4. costs of acquiring information from a data bank, and

5. poor quality of data bank services.

Each of these areas is discussed below.

Unfamiliarity with the data. It appears that the respondents
viewed a lack of knowledge about a data base and its contents as the
major barrier to outside research. Seventy-four percent of the respon-
dents said or implied that this was the case with Project TALENT, and
26% made a general reference to unfamiliarity as being a barrier to
secondary analysis.
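
The percentages quoted here and in Table 2 follow directly from the respondent counts and the base of N = 50. The following is a minimal sketch of that arithmetic, assuming the counts as they appear in Table 2; the function and variable names are hypothetical and serve only as illustration.

```python
N = 50  # number of respondents, as given in Table 2

# Counts for the five major barrier areas, as reported in Table 2.
barrier_counts = {
    "Unfamiliarity with the data": 50,
    "Attitudes against secondary analysis": 25,
    "Difficulties in secondary analysis": 48,
    "Costs of secondary analysis using large data bases": 31,
    "Poor services provided by the data bank staff": 31,
}

# The two components of unfamiliarity discussed in the text.
unfamiliarity_detail = {
    "TALENT in particular": 37,
    "Data bases generally": 13,
}


def percent(count, base=N):
    """Return the count as a whole-number percentage of the base."""
    return round(100 * count / base)


percentages = {area: percent(n) for area, n in barrier_counts.items()}
detail_percentages = {k: percent(n) for k, n in unfamiliarity_detail.items()}

for area, pct in {**percentages, **detail_percentages}.items():
    print(f"{area}: {pct}%")
```

Because a respondent could mention more than one barrier, the subcategory counts within an area need not sum to the area total.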

Although information about Project TALENT and its Data Bank activ- 
ities has been published in many places, the level of detail necessary 
to prepare a researcher to make use of the Project TALENT data base is 
contained only in the Data Bank Handbook. Given the relatively low
level of distribution of the Handbook, it is not surprising that so many
respondents did not consider themselves especially familiar with Project
TALENT. Even respondents who had once been knowledgeable about TALENT
admitted that they were not familiar with the contents of the more
recent follow-up surveys. As one respondent remarked, "The problem with
TALENT is that you don't have anything like the Coleman Report to point
to." His concern was that without a definitive or landmark publication
available, it would be difficult for most researchers to see a link
between Project TALENT and their own research interests.

Further support for this finding was evident in the reactions of 
respondents who showed active interest in using Project TALENT data for 
their own research. Although most had not yet made a direct effort to 
obtain TALENT data, their interest in doing so increased as they were 
made aware of the availability of variables pertinent to their research.
This indicates that researchers need to feel knowledgeable about the 
contents of the Project TALENT data base before they will consider it 
for their own use. 




Attitudes toward secondary analysis. The second set of barriers,
those dealing with attitudes toward secondary analysis, also appeared to
be a substantial obstacle to the outside use of a large-scale data base.
Fifty percent of the respondents mentioned a concern about the validity
of the data or a desire to work directly with one's own data. Several
respondents indicated that because they were accustomed to collecting
their own data and carrying out their own analyses, the use of an addi-
tional data source did not appeal to them. To quote one respondent, the
advantage of creating one's own data base is that "I have the data, not
analyses someone has run for me. During the course of any one day I am
likely to develop new questions, attack the data from new angles, and
generally change in response to the data."

The predisposition of many researchers to avoid using someone 
else's data conflicts with the obvious fact that it is often far more 
efficient to utilize an existing large-scale data base than to attempt 
to create a new small one. Although it depends upon the individual's 
particular research requirements, e.g., whether he absolutely needs a 
certain set of variables, many researchers do not like to compromise 
their research design simply to conform to the nature of existing data. 
There appears to be a feeling of satisfaction in creating and control- 
ling one's own data sources that secondary analysis cannot match. 
However, if a data base contains information that meets a researcher's 
unique needs, this barrier would be considerably reduced. 

Difficulties in secondary analysis. Nearly all of the respondents
pointed out at least one potential difficulty with the conduct of a 
secondary analysis, especially when using a large-scale data base. The 
absence of variables essential to a researcher's specific interests was 
the most frequently mentioned problem, but concern was also expressed 
about the quality of the data, the inability to access raw data, pos- 
sible nonrepresentativeness of the sample, and the complexity of the 
data base. Many of these same problems can also occur in primary anal-
ysis, of course, and the implication is that researchers insist on
higher standards of quality when they use someone else's data than they 
demand of their own data. 






Most of the respondents had at one time or another performed at
least one secondary analysis using a large-scale data base, such as
census data, so they tended to be familiar with the problems of such
data bases. They did, however, differ considerably in the degree of
intensity with which they regarded these potential difficulties as
actual barriers to research. The respondents tended to view the process
of getting acquainted with a data source and modifying it to meet their 
needs as a necessary step in secondary analysis; their main concern was 
that a data base actually contain key research variables and represent 
the population of interest. 

In discussions with respondents who were interested in the possibility
of using Project TALENT data, only a few areas emerged in which
respondents felt that the measures obtained from TALENT participants
were inadequate to arouse their research interest. This was partially
due to the fact that the
respondents were chosen from among those doing research in areas for
which Project TALENT might be relevant. Those respondents interested in
demographic changes affecting the TALENT samples (mobility, number of
children, marital status, etc.) indicated that the inclusion of such
items in the 17-year follow-up survey would increase the attractiveness
of the data base for them. Also, several of the economists expressed an
interest in obtaining more complete information on income and expendi-
tures of the respondents, an addition that is being considered for the
17-year follow-up survey.

Much to our surprise, those respondents who were critical of the 
quality and representativeness of the Project TALENT data base were 
apparently not aware of the extensive efforts that have been made to
maximize the data quality and to ascertain the precise degree of bias 
attributable to nonresponse on key variables and to attrition effects in 
general. There seems to be a folklore about Project TALENT that insists
that the sample is biased, a charge that must be addressed more vocif-
erously than in the past. Perhaps such titles as "When is Bias
Better?" should be avoided as they relate to TALENT because they tend to
reinforce superficial attitudes about the representativeness of TALENT.
Although several technical descriptions of the proper statistical
adjustments to the data exist, including one within the TALENT Data Bank
Handbook, they do not appear to be common knowledge; instead, what seems
to be a more prevalent (yet illogical) attitude is that the size and age
of the TALENT data base compound rather than correct for such problems
with the data.

Costs of secondary analysis using large data bases. A fourth set of
barriers to the use of a data base is the costs involved. Depending on
the perspective of the researcher, this may or may not be a critical
factor; as might be expected, older, more experienced researchers saw 
little personal difficulty in obtaining the funds necessary to cover 
data processing and analysis, whereas it was generally agreed that 
beginning researchers would find all but minor costs a definite barrier. 

There were a number of complaints about the current costs of 
analyses using TALENT data, simply because the respondents had their
own, less expensive sources of data processing. University-sponsored
computer time and consulting services, not to mention access to graduate 
student labor, make it preferable for many researchers to obtain copies 
of the raw data and do their own processing and analysis. Therefore, 
for these respondents, providing worktapes, as Project TALENT does, is a
more attractive inducement to use the data. On the other hand, two of
the respondents without such resources at their disposal said that if
they were to use TALENT data, they would prefer to have TALENT staff do 
as much of the computational work as possible. 

Poor services provided by the Data Bank staff. The fifth set of
barriers mentioned was with respect to the quality of the services and 
documentation provided by those responsible for a data base. Potential 
problems included mistakes in processing, overexpenditure of funds, time
delays, excessive paperwork, incomplete answers to questions, and cryp- 
tic or erroneous documentation. These barriers were described by the 
respondents in terms of what they hoped to avoid in using a particular 
data base, rather than as specific criticisms. 
Strategies Suggested by Respondents for Increasing Use of a Data Base



Table 3 lists the suggestions made by the respondents as to what 
Project TALENT could do to increase the use of its data. The most
frequently mentioned suggestion was that TALENT increase dissemination
of descriptions of its data base; however, respondents did not show much
agreement on the most effective methods. Direct mailing to researchers
was mentioned, as were selective mailings to heads of departments, data
archivists and survey research experts. Recommendations on the size and
contents of the literature to be sent ranged from short, general de-
scriptions to a complete data handbook.

One of the problems of disseminating information about a data base
is that relatively few researchers would be likely to read a complete
description of the data, yet they would not be personally satisfied by a
global description. Those respondents actively interested in using
TALENT data made it clear that they wanted a full, variable-by-variable
description of the data base. One respondent, a data archivist, pointed
out that the level of detail desired by a researcher is closely related
to the intensity of his or her interest in that kind of data; a
corollary of this is that researchers tend to wait until the onset of
their research before making extensive inquiries about a data base.

Respondents appeared to be more impressed by person-to-person,
researcher-to-researcher contact regarding a data base than they were by
written presentations. It was suggested that Project TALENT follow this
procedure wherever feasible, such as in presentations and demonstration
booths at conventions and seminars about accessing TALENT data, as well
as our current policy of free consultation about prospective research.

Another suggestion was to provide Public Use Samples of the Project
TALENT data, such as the one already on file with NIE, for the cost of
the computer tape containing them and the necessary documentation.
Respondents noted that this would (1) make Project TALENT directly
accessible to a large number of users, (2) allow users with limited
funds or their own computer resources to use the data inexpensively, and
(3) allow researchers to do preliminary studies of the data before
requesting analyses of the entire sample.

Table 3

Respondents' Suggestions for Increasing the Use of Project TALENT

                                                           Respondents
Suggestion                                                   N     (%)

Provide detailed documentation of data samples              31   (62%)
Increase general dissemination of information about
  Project TALENT                                            27   (54%)
Contact potential researchers directly                      17   (34%)
Make the raw data available to researchers and data
  archives                                                  15   (30%)
Defend the representativeness and the value of the
  TALENT data base                                           9   (18%)
Be more responsive to the data requirements of potential
  users, especially in the selection of the questions
  asked in the surveys                                       5   (10%)
Publish a bibliography and description of previous
  studies that used TALENT data                              4    (8%)
Improve contacts with funding agencies                       4    (8%)
Send dissemination packages to data resource authorities
  (department heads, data archivists, etc.)                  3    (6%)

Some respondents said that they would like to find out what studies
had already been done using TALENT data so that they could better relate
TALENT to their current research interests. The two reasons given for
this were to avoid a repetition of an existing study and to get some
indication of the range of studies that are possible with TALENT data.

A notable generalization of the suggested approaches to broadening 
the data bank usage is that most of the suggestions have already been 
implemented by TALENT staff, in one way or another. Without some clear 
idea of the costs and effects of the alternatives suggested, the sug- 
gestions do not appear to add substantially to our knowledge for making 
dissemination decisions. 

Funding of Research Using Large-Scale Data Bases 

The interviews conducted with representatives of Federal agencies
indicated that an effort to make such agencies aware of the value of a
data base is as important as the need to contact potential users.
Agencies tend to be uninformed about data bases that they do not
directly sponsor, and they are not likely to include such data bases in
their own funding plans. In addition, the current tendency is to fund
the creation of data bases through open-bidding contracts and to fund
secondary analyses through grants (Chinitz, 1971). This places a
burden on prospective users of a data base in that they have to convince
the funding agency of not only the value of their research but also the
validity of their data source.

The education of government personnel about the Project TALENT data
base definitely needs to be improved: as one respondent said, "I can't
recall Project TALENT ever being mentioned in our program planning
sessions." Although many agencies are receptive to the notion of
TALENT-based research as part of their grant programs, there do not
appear to be many agencies other than NIE that are actively planning to
sponsor such studies through contract awards. This suggests that a
careful review of appropriate grant programs, increased contact and
consultation with the government officials responsible for the
administration of those programs, and communication to prospective users
of likely sources of grant support would be the most effective
strategies for increasing the likelihood of funding for TALENT-based
research.

Summary and Recommendations



This was a small, exploratory study designed to help the American
Institutes for Research and the National Institute of Education decide
upon an effective strategy for promoting the more widespread use of the
Project TALENT data base. As pointed out in the Introduction, the
Project TALENT data base, like other very large computerized data bases
in the social and behavioral sciences, has the potential for supporting
research on so many alternative problems that the choice of problems to
pursue is not straightforward; it is most efficiently made at least in
part by researchers who conceptually "come to the data base" with a
significant problem. Only part of the potential can be realized by a
single set of researchers with their specific research agenda.
Therefore, AIR has maintained the Project TALENT Data Bank service,
through which nearly 150 studies have been performed by researchers
outside AIR since 1960; and since 1975 NIE, through its Education and
Work Division, has provided a public use sample of Project TALENT data
for studies relating education and work.

The purpose of the present study was to find out whether there was
potential for much more widespread use of Project TALENT data, and if
so, to assess the likely effectiveness of alternative plans for
expanding TALENT usage while at the same time maintaining the privacy
rights of the 400,000 TALENT participants. Was it possible that the vast
majority of potential TALENT users had not heard of TALENT? Were those
who were aware of TALENT unaware of the breadth of the data within it?
Were there misconceptions about weaknesses that could beset a project
like TALENT but have been dealt with by TALENT staff? Were there
general inhibitions against secondary analysis? Had the TALENT
questionnaires omitted critical data elements? Or were costs of data
acquisition a significant barrier? While it was not expected that this
study would provide definitive answers to all these questions, it was
hoped that interviews with a few dozen leading social researchers in
various fields would provide a global, if not exact, perspective from
which to proceed further.

There were four main results: (1) most large-scale survey
researchers were aware of Project TALENT; (2) however, few other than
those who had previously used Project TALENT data had more than a vague
idea of what was in the data base; (3) there was substantial interest
among researchers in using TALENT data once a description was given; and
(4) established researchers did not see the costs of data acquisition
from Project TALENT as a significant barrier.

There were numerous secondary results. Some researchers harbored a
misconception that Project TALENT staff had not dealt adequately with
the attrition problem; many suggestions were received for questions to
be included in the next Project TALENT follow-up survey to make the data
most useful to different individuals; a few researchers would not
seriously consider secondary data analysis for themselves; for some
researchers the cost of data acquisition, even a few hundred dollars,
presented a problem; and for some researchers, the need to become
immersed in data before specifying exact analyses meant that they could
not become data bank users without a sample of the data to explore.

In some cases, these results have been anticipated in actions taken
by AIR and NIE in the recent years of Project TALENT. For example, the
need for small public use samples that can be loaned to potential data
bank users for initial exploration is now satisfied with the production
of a 4,000-case self-weighted sample of respondents to the Project
TALENT 11-year follow-up survey. In other cases, the results suggest
actions to be taken (or not to be taken).

A process model such as shown in Figure 1 provides a framework for
summarizing and applying the results of this study to planning for
increased usage. If the frequency of each step in the chain in Figure 1
is increased, the total number of users will increase, but the most
efficiency will be achieved by focusing dissemination efforts on the
steps now occurring with lowest frequency. While this model is overly
simplistic, it does provide a context for interpreting the percentages
presented in Tables 1 and 2.

Figure 1. Processes leading to Data Bank usage

In order to decide whether reaching 10% or 30% or 70% of the
population is a proper goal for some dissemination activities, we
consider a crude example. Using the four-step model in Figure 1, suppose
that there were 10,000 researchers for whom the data were potentially
valuable. Now, if the conditional probability of each step's occurrence
were .10 (e.g., 10% of those who reached the conclusion that the data
were valuable for their research were then able to obtain funds), this
would lead to a single user (1 = 10,000 x (.10)^4). In the past decade,
approximately 100 researchers have become users of Project TALENT data.
Using the estimate of 10,000 potential users (we need a better estimate
in order to refine this model), we arrive at an estimate of about .30
for each of the four steps, on the average. Thus, steps for which we
estimate the probability already to be higher than .30 are not as
important as barriers as are other steps.
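
The arithmetic of this crude example can be sketched in a few lines of
Python. This is only an illustration, not part of the original analysis:
the function names are hypothetical, and the inputs (10,000 potential
users, about 100 actual users, four steps sharing one conditional
probability) are the rough assumptions stated in the text.

```python
# Illustrative sketch of the four-step chain model (names are hypothetical).
# Assumes, as in the text, that every step has the same conditional probability.


def expected_users(population: float, step_probability: float, steps: int = 4) -> float:
    """Expected number of users when each step is passed with the same probability."""
    return population * step_probability ** steps


def implied_step_probability(population: float, users: float, steps: int = 4) -> float:
    """Average per-step probability implied by an observed number of users."""
    return (users / population) ** (1.0 / steps)


if __name__ == "__main__":
    # A .10 chance at each of four steps leaves about 1 user out of 10,000.
    print(round(expected_users(10_000, 0.10), 6))
    # About 100 users out of an assumed 10,000 implies roughly .32 per step,
    # which the text rounds to "about .30".
    print(round(implied_step_probability(10_000, 100), 3))
```

Under these assumptions, a step whose estimated probability is well
above the implied average contributes less to the shortfall in users
than a step whose probability falls below it.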

Concerning the first step in the process model, the general
awareness of Project TALENT among potential research users was quite
high; therefore, unless there are other potential users not tapped by
this survey, efforts aimed at merely broadening awareness of Project
TALENT need not be expanded. That does not mean they should be
discontinued, of course, because general awareness is a function of the
appearance of articles and reports and of oral presentations at
conventions and would soon fade in the absence of these.

The second step, forming a commitment to consider using TALENT
data, is much more problematic. There are three barriers to be overcome
in convincing a potential user to look carefully at the data and to
apply the data to his/her problem if appropriate. First, there is the
hesitancy to use somebody else's data and "merely" perform secondary
analyses. Although this survey focused on researchers who had used
survey data not unlike the contents of Project TALENT's data base, there
were nevertheless quite a few responses that indicated an unwillingness
to consider using the data. That attitude is not easily changed;
however, until it is, a large reservoir of potential users will remain
untapped. One possible tactic to overcome this barrier is to select,
say, a half dozen of the most promising areas and devote about two
person-years to production of articles for leading research journals
that use TALENT data to address problems of high current relevance.
These articles could be written to include persuasive arguments for the
importance, feasibility, and efficiency of secondary analyses.

In addition to hesitancy to perform secondary analyses, there is
the general problem of inertia. Productive researchers have ongoing
research agendas that may not allow time to consider alternatives such
as Project TALENT. This is one reason that a large proportion of the
recent Project TALENT data bank users have been graduate students. A
corollary has been that projects requiring substantial work for a small
budget have been the rule, not the exception. The American Institutes
for Research and the staff of the Project TALENT Data Bank have donated
substantial efforts to helping these graduate student users, but without
an independent source of support such efforts cannot be expanded and, in
fact, will probably have to be discontinued.

The problem of inertia may be best overcome by creating "small
investment" steps which busy researchers will be willing to undertake
and which will provide a compelling argument for further interest in
data bank usage. One step taken by Project TALENT staff in this area
has been production of the report, The American Citizen: 11 Years After
High School, in which potential researcher users can quickly find the
response distributions to follow-up questions that might form the basis
for a study and at the same time assess relations of these responses to
variables measured in high school. Selective mailing of this document
to researchers previously contacted by telephone and expressing interest
in finding out about TALENT research results could prove fruitful.

The third attitudinal problem, in addition to hesitancy to perform
secondary analysis and inertia, is mistrust of the validity of the data.
For long-term longitudinal analysis the most prevalent grounds for
mistrust concern attrition. There appears to be a consensus in some
circles that Project TALENT staff have not dealt adequately with
attrition, and that has led a number of researchers to avoid using
Project TALENT. In fact, however, Project TALENT staff have dealt
adequately with attrition, and, as one respondent suggested, it is
imperative to make that fact widely known. In a recent presentation to
the American Educational Research Association (Wise, 1977), the deputy
director of Project TALENT made a forceful argument that the TALENT
subject-weighting scheme virtually eliminates nonresponse bias. This
message should be repeated frequently to counteract a prevalent
misconception.

The third step in the process of becoming a data bank user concerns 
the objective evaluation of the appropriateness of the data for one's 
research problem. There appears to be a substantial barrier to use of
the data when detailed information on the data base is not directly in
the researcher's grasp. In response to this need, the Project TALENT
Data Bank Handbook has been produced and updated. That Handbook aims to 
convey all the information necessary to specify exactly what the data 
base contains. The necessity of charging a substantial amount for the 
Handbook, rather than being able to mail it free to those expressing 
interest in use of TALENT data, is a continuing problem that could be 
solved with a small investment. A subsidy by NIE or AIR for the print- 
ing and mailing of copies of the Handbook would appear to be a prudent 
tactic. This might amount to $1,000 per year. 

Of course, no handbook can supply information in enough detail to 
tell a potential research user that his/her research plan will be
successful using a particular data base: if the topic of the research
involves a particular type of individual, the numbers of respondents of 
that particular type in the data base may be insufficient for the pro- 
posed research. One solution to the need for a chance to find out 
roughly the power of the data base for a particular research problem is 
to provide a small sample of the data which can be thoroughly explored 
at small cost. Project TALENT has, in late 1977, created a file of 
4,000 respondents to the 11-year follow-up survey to be used for such 
exploratory analyses. The respondents on the Exploratory Tape were 
selected so that no differential weighting would be necessary and so 
that the tape would, in general, be easy and inexpensive to use. It is 
the intent of Project TALENT to provide a copy of this file to qualified 
researchers for quick and inexpensive analyses for a fee that covers the 
reproduction of the file and documentation.

The fourth and final hurdle to be overcome in persuading a
potential data bank user to become a data bank user is the obtaining of
funds to pay for data transfer and/or analyses. The findings of the
present survey suggest that a wide variety of perspectives exist with
regard to funding. Conversations with representatives of funding sources
outside NIE indicated neutrality and general lack of knowledge about
Project TALENT as a data base for secondary analyses. While Project
TALENT was not being mentioned explicitly in RFPs or research program
announcements, neither was there a prejudice against research conducted
using Project TALENT data.

The only obvious solution to this hurdle is to reduce costs to the
potential user, either through greater efficiency or through subsidy.
In view of the level of charges for projects in the recent past, there
is no reason to believe that AIR could contribute substantially to
reducing costs, other than in the ways mentioned above. Subsidy by a
funding source does not appear viable either, except in terms of a
favorable attitude concerning research proposals that aim to perform
secondary analyses using Project TALENT data. Presentations should be
made by Project TALENT representatives to assure that favorable
attitude.

Aside from these activities to deal with the particular barriers,
one valuable tactic emerged from this survey: the mere conduct of the
survey, writing and talking to leading researchers in relevant areas,
appeared to substantially increase their personal interest in using
Project TALENT data. Because of funding lags, the effect of this
activity in terms of data bank usage may not be felt for a year or more.
On the basis of initial indicators, however, it would appear to be
important to assign a small amount of effort to continued perusal of the
relevant journals and selective contacting and interviewing of
appropriate researchers.

APPENDIX A



DATA USER INTERVIEW GUIDE 

Review available information on each user (read journal article, note
positions held, previous articles, etc.).

Phone each user, based on best telephone number available.

1. If better phone number is needed, try to obtain it and phone again.

2. If address has changed, get it.

3. If user is not available, ask for best contact time or set
appointment to call.

Conduct interview. Have brochure available for reference.

Introduction: Hello, my name is ________ and I am calling from
Palo Alto, California, on behalf of Project TALENT, a Federally
supported data base. By now you should have received a brochure in
the mail about Project TALENT. Has it arrived?

[If yes, go to 1.]

[If no, say this and go to 1:]

We are conducting a survey of professional researchers to find out
what factors will facilitate or inhibit researchers from using data
bases like Project TALENT. We would appreciate your response to several
short questions.

Project TALENT started collecting information on 375,000 high
school students in 1960 to study their life development. Data included
detailed information on their cognitive skills, interests, plans, family
backgrounds, and current activities. These individuals were contacted
again at ages 19, 23, and 29 to obtain longitudinal data on their
educational, occupational, and social experiences. A fourth follow-up
survey of respondents at age 35 is scheduled to begin this fall.

(1) Have you heard about Project TALENT before? How?

(2) What is your current research interest?

Are you (also) interested in education, family, or career
development?

(3) What barriers do you see that would prevent a researcher such as
yourself from using data bases like Project TALENT?

[Probes: Chances of funding
Availability of key variables
Costs
Unfamiliarity with TALENT
Use of someone else's data]

(4) What are your recommendations for increasing the use of data bases
like Project TALENT?

(5) We are now formulating the questionnaire for our follow-up of
respondents at age 35. What topical areas do you feel we should
address?

That completes my list of questions. Thank you very much for your time 
and cooperation. 




APPENDIX B 
FUNDING SOURCE INTERVIEW GUIDE 



A. Review available information on each contact (read journal article,
note positions held, previous articles, etc.).

B. Phone each contact, based on best telephone number available.

1. If better phone number is needed, try to obtain it and phone again.

2. If address has changed, get it.

3. If contact is not available, ask for best contact time or set
appointment to call.

C. Conduct interview. Have brochure available for reference.

Introduction: Hello, my name is ________ and I am calling from
Palo Alto, California, on behalf of Project TALENT, a Federally
supported data base. By now you should have received a brochure in
the mail about Project TALENT. Has it arrived?

[If yes, go to 1.]

[If no, say this and go to 1:]

We are conducting an informal survey of the federal research
system to find out what factors will facilitate or inhibit researchers
from using data bases like Project TALENT. We would appreciate your
response to several short questions.

Project TALENT started collecting information on 375,000 high
school students in 1960 to study their life development. Data included
detailed information on their cognitive skills, interests, plans, family
backgrounds, and current activities. These individuals were contacted
again at ages 19, 23, and 29 to obtain longitudinal data on their
educational, occupational, and social experiences. A fourth follow-up
survey of respondents at age 35 is scheduled to begin this fall.

(1) Have you heard about Project TALENT before? How?

(2) Does your agency/organization fund research studies utilizing
existing data bases like Project TALENT?

(3) What barriers do you see that would prevent your agency from
sponsoring projects using Project TALENT data?

[Probes: Chances for funding
Availability of key variables
Costs
Unfamiliarity]

(4) What recommendations would you make for us to increase the use
of TALENT data? [Probe for dissemination strategies, etc.]

Thank you very much for your time and cooperation.